Open In Colab

Emotion AI¶

Nutshell¶

As of now this project is under development.

In this project I build a program that classifies emotions from images of human faces, following the course Modern Artificial Intelligence taught by Dr. Ryan Ahmed, Ph.D., MBA.

The data set I use is from https://www.kaggle.com/c/facial-keypoints-detection/overview and consists of over 20,000 facial images labeled with a facial expression/emotion, and approximately 2,000 images with keypoint annotations.

The program will train two models which detect

  1. facial keypoints
  2. emotions.

These models are then combined into one model that provides a holistic prediction of the emotion as its output.

A short recap of artificial neural networks¶

Artificial neurons are built in a similar way to human neurons. They take in signals through input channels (dendrites in biological neurons), process the information through a transfer function (the cell body), and generate an output (which in a biological neuron would travel through the axon).


Fig. 1. Side-by-side view of artificial and biological neurons. Credit: Top image from Introduction to Psychology (A Critical Approach), Copyright © 2021 by Rose M. Spielman, Kathryn Dumper, William Jenkins, Arlene Lacombe, Marilyn Lovett, and Marion Perlmutter, licensed under a Creative Commons Attribution 4.0 International License. Bottom image: Chrislb, CC BY-SA 3.0, via Wikimedia Commons.

For example, let's consider an artificial neuron (AN) that takes three inputs: $x_1$, $x_2$, and $x_3$. We can then express the output of the artificial neuron mathematically as $y = \phi(x_1 w_1 + x_2 w_2 + x_3 w_3 + b)$. Here $y$ is the output and the $w_i$ are the weights assigned to each input signal. $b$ is a bias term added to the weighted sum of inputs, and $\phi$ is the activation function.
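In code, this weighted-sum-plus-activation can be sketched as follows (a minimal example using the logistic activation; the input, weight and bias values are arbitrary):

```python
import numpy as np

def neuron(x, w, b, phi):
    # A single artificial neuron: weighted sum of inputs plus bias, passed through phi
    return phi(np.dot(x, w) + b)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0, 2.0])   # three input signals
w = np.array([0.4, 0.3, -0.2])   # arbitrary weights
b = 0.1                          # bias term
y = neuron(x, w, b, sigmoid)     # output lies in (0, 1) because of the sigmoid
```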

Some common modern activation functions used in neural networks are ReLU, GELU and the logistic activation function. ReLU is short for Rectified Linear Unit and is defined as $\phi(x) = \max(0, x)$. ReLU is recommended for hidden layers, since it outputs a linear response for positive values. This helps maintain larger gradients and makes training deep networks more feasible.

The Gaussian Error Linear Unit (GELU) is a smoother version of ReLU and is defined as $x\Phi(x)$, where $\Phi(x)$ is the standard Gaussian cumulative distribution function.

The logistic activation function is also called the sigmoid function and is defined as $\phi(x) = \frac{1}{1+e^{-x}}$. It maps any real number into the interval $(0, 1)$ and is therefore very useful in output layers.
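The three activation functions above can be sketched directly in numpy (the exact GELU here uses the error function from the standard library; the sample points are arbitrary):

```python
import math
import numpy as np

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity for positive ones
    return np.maximum(0.0, x)

def gelu(x):
    # exact GELU: x * Phi(x), with Phi the standard normal CDF expressed via erf
    return x * 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def sigmoid(x):
    # logistic function: maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
relu_vals, gelu_vals, sigmoid_vals = relu(xs), gelu(xs), sigmoid(xs)
```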


Training¶

All neural networks need to be trained with labeled data. The available data is generally divided into 80% training and 20% testing data. It is also recommended to further divide the training portion into an actual training set (e.g. 60%) and a validation set (e.g. 20%).
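The 60/20/20 split can be sketched with a shuffled index array (a minimal example on a dummy data set of 100 samples; in practice library helpers do the same thing):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100                              # dummy data set of 100 samples
indices = rng.permutation(n)         # shuffle before splitting

n_train = (60 * n) // 100            # 60 % training
n_val = (20 * n) // 100              # 20 % validation; the rest is testing
train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]
```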

Training is done by adjusting the weights of the network, iteratively minimising the cost function with an optimisation algorithm such as gradient descent. Gradient descent calculates the gradient of the cost function and takes a step in the negative gradient direction, repeating until it reaches a local or global minimum.

A typical choice for a cost function is the quadratic loss, which is formulated as $f_{loss}(w,b)= \frac{1}{N}\sum^N_{i=1}(\hat y_i-y_i)^2$.

Gradient descent algorithm:

1. Calculate the derivative of the loss function, $\frac{\partial f_{loss}}{\partial w}$.

2. Pick random values for the weights and substitute them in.

3. Calculate the step size, i.e. how much we will update our weights:

step size = learning rate × gradient $= \alpha \cdot \frac{\partial f_{loss}}{\partial w}$

4. Update the parameters and repeat:

new weight = old weight − step size, i.e. $w_{new}=w_{old}-\alpha \cdot \frac{\partial f_{loss}}{\partial w}$
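The update rule above can be sketched on a simple one-dimensional quadratic loss (a minimal example; the loss function, starting point and learning rate are arbitrary):

```python
# Minimise f(w) = (w - 3)^2 with plain gradient descent
def grad(w):
    return 2.0 * (w - 3.0)       # derivative of the loss

w = 0.0                          # arbitrary starting weight
alpha = 0.1                      # learning rate
for _ in range(100):
    w = w - alpha * grad(w)      # new weight = old weight - step size
```

After the loop, `w` has converged to the minimum at 3.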

Below is an example of searching for the minimum of a U-shaped function with gradient descent. Usually the problem is multidimensional, but this simplification is solved in the same way.


Testing various learning rates helps understand the importance of choosing the training parameters.


As shown above, a learning rate that is too large can lead to missing the global minimum and/or slower convergence. A learning rate that is too small is equally problematic, as the model barely learns. To address both issues there are several approaches that adjust the learning rate dynamically.

Momentum is analogous to a ball's tendency to keep rolling downhill. It is used to speed up learning when the error-cost gradient keeps heading in the same direction, and to slow down when a level area is reached. Momentum is controlled by a parameter analogous to the mass of the rolling ball. A large momentum helps avoid getting stuck in local minima, but might also push through the minimum we wish to find. Thus, the parameter has to be selected carefully.

Learning rates can also be adjusted through decay, which reduces the learning rate by a certain amount after a fixed number of epochs. This can help in situations like the one above, where a too-large learning rate makes learning jump back and forth over a minimum.
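Momentum and decay can both be sketched as small modifications of the basic update loop (illustrative values only; the quadratic loss, momentum coefficient and decay schedule are arbitrary choices for the example):

```python
# Gradient descent with momentum and step decay on f(w) = (w - 3)^2
def grad(w):
    return 2.0 * (w - 3.0)

w, velocity = 0.0, 0.0
alpha, beta = 0.1, 0.9           # learning rate and momentum coefficient
for epoch in range(100):
    if epoch > 0 and epoch % 30 == 0:
        alpha *= 0.5             # decay: halve the learning rate every 30 epochs
    velocity = beta * velocity - alpha * grad(w)
    w = w + velocity             # momentum carries the update direction forward
```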

Adagrad and Adam are examples of popular adaptive optimisation algorithms for gradient descent.

Network architectures¶

The artificial neurons are connected to each other to form neural networks and a plethora of different network architectures exist. To harness the power of AI, it is necessary to know which architecture serves the intended purpose best. Below are three common architectures and their applications.

Recurrent Neural Networks (RNNs) handle sequential data by maintaining a hidden state that captures information about previous elements in the sequence. Therefore they are great for contexts where the output depends on previous inputs, for example time series and natural language processing.

Generative Adversarial Networks (GANs) consist of two neural networks, the Generator and the Discriminator. They spar with each other in a zero-sum game framework: the generator creates synthetic data that resembles real data, and the discriminator evaluates whether it is real or not. This drives the generator to output increasingly realistic data. Naturally, GANs are the choice for many image generation and editing tasks, but also for anomaly detection in industrial and security contexts: they can model regular patterns and subsequently detect anomalies by comparing generated outputs with real inputs.

Convolutional Neural Networks (CNN) are designed to process data with a grid-like topology and are most commonly used in image analysis. They utilise convolutional layers to learn spatial hierarchies by applying filters (kernels) that slide (convolve) over the input. They usually involve pooling layers that reduce the spatial dimensions and fully connected layers that map the extracted features to outputs.


Fig. 2. Convolutional neural network. Credit: Aphex34, CC BY-SA 4.0, via Wikimedia Commons

In the Emotion AI project, I will use a Residual Neural Network (ResNet). ResNet's architecture includes "skip connections", which enable training very deep networks without vanishing gradient issues. The vanishing gradient problem occurs when the gradient becomes very small as it is back-propagated to the earlier layers. A skip connection works by passing the input of one layer to a layer further down in the network; this is also called identity mapping. The ResNet model I use has been pretrained on the ImageNet dataset.
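The identity mapping itself is simple: the block's input is added back onto its output before the final activation. A numpy sketch, with `F` standing in for the block's learned convolutions:

```python
import numpy as np

def residual_block(x, F):
    # y = relu(F(x) + x): the skip connection adds the input back in,
    # so gradients can flow through the addition even if F's gradient vanishes
    return np.maximum(0.0, F(x) + x)

x = np.array([1.0, -2.0, 0.5])
# If the learned transform F is (near) zero, the block reduces to relu(x)
identity_out = residual_block(x, lambda t: np.zeros_like(t))
```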


Fig. 3. Identity mapping. Credit: LunarLullaby, CC BY-SA 4.0, via Wikimedia Commons

Part 1. Key facial points detection¶

In this section I build the deep-learning model, a convolutional neural network with residual blocks, to predict facial keypoints. The data set is from https://www.kaggle.com/c/facial-keypoints-detection/overview.

The dataset consists of input images, each annotated with 15 facial key points. The training.csv file has 7049 face images with corresponding keypoint locations (2140 of them with all keypoints present). The test.csv file has face images only, and will be used to test the model. Each image is stored as a single space-separated string of 96 × 96 = 9216 pixel values, which has to be transformed into the real image shape of (96, 96). Thus we parse the string into a 1-D array and reshape it into a 2-D array.

The model I build will have the architecture presented below. The Resblock consists of two different types of blocks: the convolution block and the identity block. As seen below, both have an additional short path that adds the original input to the output. For the convolution block this involves a few extra steps to reshape the input to the same dimensions as the output of the longer path.

Final model architecture Resblock architecture
key_points_df['Image'].shape
key_points_df['Image'][0]
type(key_points_df['Image'][0])

key_points_df['Image'] = key_points_df['Image'].apply(lambda img: np.fromstring(img, dtype=int, sep=' ').reshape(96, 96))
key_points_df['Image'][0].shape
(96, 96)
key_points_df.describe()
left_eye_center_x left_eye_center_y right_eye_center_x right_eye_center_y left_eye_inner_corner_x left_eye_inner_corner_y left_eye_outer_corner_x left_eye_outer_corner_y right_eye_inner_corner_x right_eye_inner_corner_y ... nose_tip_x nose_tip_y mouth_left_corner_x mouth_left_corner_y mouth_right_corner_x mouth_right_corner_y mouth_center_top_lip_x mouth_center_top_lip_y mouth_center_bottom_lip_x mouth_center_bottom_lip_y
count 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 ... 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000 2140.000000
mean 66.221549 36.842274 29.640269 37.063815 59.272128 37.856014 73.412473 37.640110 36.603107 37.920852 ... 47.952141 57.253926 63.419076 75.887660 32.967365 76.134065 48.081325 72.681125 48.149654 82.630412
std 2.087683 2.294027 2.051575 2.234334 2.005631 2.034500 2.701639 2.684162 1.822784 2.009505 ... 3.276053 4.528635 3.650131 4.438565 3.595103 4.259514 2.723274 5.108675 3.032389 4.813557
min 47.835757 23.832996 18.922611 24.773072 41.779381 27.190098 52.947144 26.250023 24.112624 26.250023 ... 24.472590 41.558400 43.869480 57.023258 9.778137 56.690208 32.260312 56.719043 33.047605 57.232296
25% 65.046300 35.468842 28.472224 35.818377 58.113054 36.607950 71.741978 36.102409 35.495730 36.766783 ... 46.495330 54.466000 61.341291 72.874263 30.879288 73.280038 46.580004 69.271669 46.492000 79.417480
50% 66.129065 36.913319 29.655440 37.048085 59.327154 37.845220 73.240045 37.624207 36.620735 37.920336 ... 47.900511 57.638582 63.199057 75.682465 33.034022 75.941985 47.939031 72.395978 47.980854 82.388899
75% 67.332093 38.286438 30.858673 38.333884 60.521492 39.195431 74.978684 39.308331 37.665280 39.143921 ... 49.260657 60.303524 65.302398 78.774969 35.063575 78.884031 49.290000 75.840286 49.551936 85.697976
max 78.013082 46.132421 42.495172 45.980981 69.023030 47.190316 87.032252 49.653825 47.293746 44.887301 ... 65.279654 75.992731 84.767123 94.673637 50.973348 93.443176 61.804506 93.916338 62.438095 95.808983

8 rows × 30 columns

We perform a sanity check on the data by visualising 64 randomly chosen images along with their key facial points.


Image augmentation¶

Here we create an additional data set in which the images are changed slightly, to improve the generalisation of the final AI model. We want more data and more variability in e.g. orientation, lighting conditions, or image size. This reduces the likelihood of overfitting and ensures that the model learns the meaningful "concepts" of emotion recognition. We create this extra data by copying the original data set and tweaking it.

I will create 3 types of augmented images:

  1. horizontal flipping
  2. randomly increased brightness
  3. vertical flipping
(4280, 31)
(6420, 31)
(8560, 31)
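A horizontal flip, for instance, must mirror both the pixel grid and the x-coordinates of the keypoints (a sketch assuming 96-pixel-wide images and dummy values; for image width w, a column x maps to w − 1 − x):

```python
import numpy as np

w = 96
img = np.arange(96 * 96).reshape(96, 96).astype(float)  # dummy image
keypoint_x = np.array([66.2, 29.6])                     # dummy keypoint x-coordinates

flipped_img = np.fliplr(img)          # mirror the pixels left-right
flipped_x = (w - 1) - keypoint_x      # mirror the keypoint x-coordinates
```

Note that after a horizontal flip the left/right keypoint labels (e.g. left_eye_center vs right_eye_center) also need to be swapped, since anatomical left and right change sides.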
# ------------------------------------------------------------
#   • key_points_df  – original DataFrame (images + 30 landmark columns)
#   • columns        – landmark column list [x1, y1, x2, y2, …, x15, y15]
#   • augmented_df   – NumPy array already containing the flip-augmented data
# ------------------------------------------------------------

# 1. Image centre (all images are 96 × 96)
h, w = key_points_df['Image'].iloc[0].shape
cx, cy = w / 2, h / 2

# 2. Make a working copy
key_points_df_rot = key_points_df.copy()

# 3. Rotate every row with its *own* random angle 5–80°
for idx in key_points_df_rot.index:
    angle = random.randint(5, 80)                 # unique angle
    theta = np.deg2rad(angle)
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    # 3a. rotate the image (negative angle ⇒ correct screen direction)
    img = key_points_df.at[idx, 'Image']
    key_points_df_rot.at[idx, 'Image'] = rotate(
        img, -angle, reshape=False, mode='nearest')

    # 3b. rotate all key-points
    for i in range(0, len(columns), 2):
        xcol, ycol = columns[i], columns[i + 1]
        x = key_points_df.at[idx, xcol]
        y = key_points_df.at[idx, ycol]
        x0, y0 = x - cx, y - cy
        key_points_df_rot.at[idx, xcol] = x0 * cos_t - y0 * sin_t + cx
        key_points_df_rot.at[idx, ycol] = x0 * sin_t + y0 * cos_t + cy

# 4. Convert the rotated DataFrame to an ndarray and stack it onto the augmented data
rot_array = key_points_df_rot.to_numpy()
augmented_df = np.concatenate((augmented_df, rot_array), axis=0)


# Show the original and rotated image side by side
img = 1900

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(7, 7))
axes[0].imshow(key_points_df['Image'][img], cmap='gray')
for j in range(1, 31, 2):
    axes[0].plot(key_points_df.iloc[img, j-1], key_points_df.iloc[img, j], marker='.', color='r')

axes[1].imshow(key_points_df_rot['Image'][img], cmap='gray')
for j in range(1, 31, 2):
    axes[1].plot(key_points_df_rot.iloc[img, j-1], key_points_df_rot.iloc[img, j], marker='.', color='r')
plt.show()
augmented_df.shape
(10700, 31)

Data normalization and scaling¶

I normalize the image pixel values to the range 0–1, which generally gives better results in neural networks.
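The scaling step itself can be sketched as follows (assuming 8-bit pixel values; the array name and values here are illustrative):

```python
import numpy as np

pixels = np.array([[0, 128, 255]], dtype=np.uint8)  # dummy 8-bit pixel values
scaled = pixels.astype(np.float32) / 255.0          # now in the range 0-1
```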

# Obtain the x and y coordinates to be used as target
img_target = augmented_df[:,:30]
img_target = np.asarray(img_target).astype(np.float32)
img_target.shape
(10700, 30)
# Split the data into train and test data
X_train_kp, X_test_kp, y_train_kp, y_test_kp = train_test_split(img_array, img_target, test_size=0.2, random_state=42)
X_train_kp.shape
(8560, 96, 96, 1)
X_test_kp.shape
(2140, 96, 96, 1)
y_test_kp.shape
(2140, 30)
y_train_kp.shape
(8560, 30)

Building the Residual Neural Network model for key facial points detection¶

Kernels are used to modify the input by sweeping them over it, as shown in this animation:

2D Convolution Animation

Fig. 4 Performing a convolution on 6x6 input with a 3x3 kernel using stride 1x1. Credit: Michael Plotke, CC BY-SA 3.0, via Wikimedia Commons.

For example, we could perform a 2D convolution for our input with this command:

X = Conv2D(filters=64, kernel_size=(7,7), strides=(2,2), kernel_initializer = glorot_uniform(seed=0))(X_input)

Here we tell the function that we want to

  • use 64 distinct filters (each one is a trainable 7×7 “weight grid”).
  • use stride 2x2, i.e., the filter jumps 2 pixels at a time, effectively “skipping” every other location.
  • initialise the kernels with the glorot_uniform method, aka Xavier uniform initialization. This draws samples from a uniform distribution within a range determined by the number of input and output units.
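The sliding-window operation itself can be sketched in plain numpy (a single filter, 'valid' padding, no kernel flipping, i.e. the cross-correlation used in CNNs; the all-ones input and kernel are just for illustration):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    # Slide the kernel over the image, computing a dot product at each position
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1     # output height
    ow = (iw - kw) // stride + 1     # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.ones((6, 6))
kernel = np.ones((3, 3))
result = conv2d(image, kernel, stride=1)  # each output is the sum over a 3x3 patch
```

With stride 1 a 6×6 input and a 3×3 kernel yield a 4×4 output, matching the animation above; stride 2 would halve that to 2×2.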

In this section I define the model architecture using Keras. Below is the code to generate Resblocks.

# @title Resblock

def res_block(X, filter, stage):
  """
  Implementation of the Resblock.

  Arguments:
  X -- input tensor
  filter -- tuple/list of three integers, the number of filters for each conv layer (f1, f2, f3)
  stage -- string, used to name the layers uniquely

  Returns:
  X -- output tensor of the res block
  """
  ### 1: Convolutional block###
  # Make a copy of the input
  X_shortcut = X

  f1, f2, f3 = filter

  # ----Long (main) path-----
  # Conv2d
  X = Conv2D(f1, kernel_size = (1,1), strides = (1,1), name=str(stage)+'convblock'+'_conv_a', \
             kernel_initializer = glorot_uniform(seed=0))(X)
  # MaxPool2D
  X = MaxPool2D(pool_size=(2,2))(X)
  # BatchNorm,ReLU
  X = BatchNormalization(axis = 3, name=str(stage)+'convblock'+'_bn_a')(X)
  X = Activation('relu')(X)

  # Conv2D (kernel 3x3)
  X = Conv2D(f2, kernel_size = (3,3), strides = (1,1), padding = 'same', name=str(stage)+'convblock'+'_conv_b', \
            kernel_initializer = glorot_uniform(seed=0))(X)
  # BatchNorm, ReLU
  X = BatchNormalization(axis = 3, name=str(stage)+'convblock'+'_bn_b')(X)
  X = Activation('relu')(X)

  #Conv2D
  X = Conv2D(f3, kernel_size = (1,1), strides = (1,1), name=str(stage)+'convblock'+'_conv_c', \
             kernel_initializer = glorot_uniform(seed=0))(X)
  #BatchNorm, ReLU
  X = BatchNormalization(axis = 3, name=str(stage)+'convblock'+'_bn_c')(X)


  # ----Short path----

  # Conv2D
  X_shortcut = Conv2D(f3, kernel_size = (1,1), strides = (1,1), name=str(stage)+'convblock'+'_conv_short', \
                      kernel_initializer = glorot_uniform(seed=0))(X_shortcut)
  # MaxPool2D and Batchnorm
  X_shortcut = MaxPool2D(pool_size=(2,2))(X_shortcut)
  X_shortcut = BatchNormalization(axis = 3, name=str(stage)+'convblock'+'_bn_short')(X_shortcut)


  # ----Add Paths together----
  X = Add()([X, X_shortcut])
  X = Activation('relu')(X)

  ### 2: Identity block 1 ###
  # Save the input value (shortcut path)
  X_shortcut = X
  block = 'iden1'
  # First component: Conv2D -> BatchNorm -> ReLU
  X = Conv2D(f1, (1, 1), strides=(1, 1), name=str(stage) + block + '_conv_a', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_a')(X)
  X = Activation('relu')(X)

  # Second component: Conv2D (3x3) -> BatchNorm -> ReLU
  X = Conv2D(f2, (3, 3), strides=(1, 1), padding='same', name=str(stage) + block + '_conv_b', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_b')(X)
  X = Activation('relu')(X)

  # Third component: Conv2D (1x1) -> BatchNorm
  X = Conv2D(f3, (1, 1), strides=(1, 1), name=str(stage) + block + '_conv_c', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_c')(X)

  # Add shortcut value to the main path
  X = Add()([X, X_shortcut])
  X = Activation('relu')(X)

  ### 3: Identity block 2 ###
   # Save the input value (shortcut path)
  X_shortcut = X
  block = 'iden2'
  # First component: Conv2D -> BatchNorm -> ReLU
  X = Conv2D(f1, (1, 1), strides=(1, 1), name=str(stage) + block + '_conv_a', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_a')(X)
  X = Activation('relu')(X)

  # Second component: Conv2D (3x3) -> BatchNorm -> ReLU
  X = Conv2D(f2, (3, 3), strides=(1, 1), padding='same', name=str(stage) + block + '_conv_b', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_b')(X)
  X = Activation('relu')(X)

  # Third component: Conv2D (1x1) -> BatchNorm
  X = Conv2D(f3, (1, 1), strides=(1, 1), name=str(stage) + block + '_conv_c', \
             kernel_initializer=glorot_uniform(seed=0))(X)
  X = BatchNormalization(axis=3, name=str(stage) + block + '_bn_c')(X)

  # Add shortcut value to the main path
  X = Add()([X, X_shortcut])
  X = Activation('relu')(X)

  return X

Now that the Resblock is defined we can build the final model.

# @title Final Resnet Neural Network model

input_shape = (96,96,1)

# Input tensor shape
X_input = Input(input_shape)

# Zero-padding
X = ZeroPadding2D((3,3))(X_input)

# Stage 1
X = Conv2D(filters = 64, kernel_size = (7,7), strides = (2,2), name='conv1', \
           kernel_initializer = glorot_uniform(seed=0))(X)
X = BatchNormalization(axis = 3, name = 'bn_conv1')(X)
X = Activation('relu')(X)
X = MaxPooling2D((3,3), strides = (2,2))(X)

# Stage 2
X = res_block(X, filter =  [64, 64, 256], stage = 'res1')

# Stage 3
X = res_block(X, filter = [128,128,512], stage = 'res2')

# We could also add more resblocks if we want
# X = res_block(X, filter= [256,256,1024], stage= 'res3')

# Average pooling
X = AveragePooling2D((2,2), name = 'avg_pool')(X)

# Flatten
X = Flatten()(X)

# Dense, ReLU, Dropout
X = Dense(4096, activation = 'relu')(X)
X = Dropout(0.2)(X)
X = Dense(2048, activation = 'relu')(X)
X = Dropout(0.1)(X)
X = Dense(30, activation = 'relu')(X)

model_1_facialKeyPoints = Model(inputs = X_input, outputs = X)
Model: "functional_4"
+---------------------+-------------------+------------+-------------------+
| Layer (type)        | Output Shape      |    Param # | Connected to      |
+---------------------+-------------------+------------+-------------------+
| input_layer_1       | (None, 96, 96, 1) |          0 | -                 |
| (InputLayer)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| zero_padding2d_1    | (None, 102, 102,  |          0 | input_layer_1[0]… |
| (ZeroPadding2D)     | 1)                |            |                   |
+---------------------+-------------------+------------+-------------------+
| conv1 (Conv2D)      | (None, 48, 48,    |      3,200 | zero_padding2d_1… |
|                     | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| bn1                 | (None, 48, 48,    |        256 | conv1[0][0]       |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_19       | (None, 48, 48,    |          0 | bn1[0][0]         |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_5     | (None, 23, 23,    |          0 | activation_19[0]… |
| (MaxPooling2D)      | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 23, 23,    |      4,160 | max_pooling2d_5[… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_6     | (None, 11, 11,    |          0 | res2convblock_co… |
| (MaxPooling2D)      | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_a  | (None, 11, 11,    |        256 | max_pooling2d_6[… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_20       | (None, 11, 11,    |          0 | res2convblock_bn… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 11, 11,    |     36,928 | activation_20[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_b  | (None, 11, 11,    |        256 | res2convblock_co… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_21       | (None, 11, 11,    |          0 | res2convblock_bn… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 23, 23,    |     16,640 | max_pooling2d_5[… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 11, 11,    |     16,640 | activation_21[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_7     | (None, 11, 11,    |          0 | res2convblock_co… |
| (MaxPooling2D)      | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_c  | (None, 11, 11,    |      1,024 | res2convblock_co… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_s… | (None, 11, 11,    |      1,024 | max_pooling2d_7[… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_6 (Add)         | (None, 11, 11,    |          0 | res2convblock_bn… |
|                     | 256)              |            | res2convblock_bn… |
+---------------------+-------------------+------------+-------------------+
| activation_22       | (None, 11, 11,    |          0 | add_6[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_a    | (None, 11, 11,    |     16,448 | activation_22[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_a      | (None, 11, 11,    |        256 | res2iden1_conv_a… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_23       | (None, 11, 11,    |          0 | res2iden1_bn_a[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_b    | (None, 11, 11,    |     36,928 | activation_23[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_b      | (None, 11, 11,    |        256 | res2iden1_conv_b… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_24       | (None, 11, 11,    |          0 | res2iden1_bn_b[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_c    | (None, 11, 11,    |     16,640 | activation_24[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_c      | (None, 11, 11,    |      1,024 | res2iden1_conv_c… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_7 (Add)         | (None, 11, 11,    |          0 | res2iden1_bn_c[0… |
|                     | 256)              |            | activation_22[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_25       | (None, 11, 11,    |          0 | add_7[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_a    | (None, 11, 11,    |     16,448 | activation_25[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_a      | (None, 11, 11,    |        256 | res2iden2_conv_a… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_26       | (None, 11, 11,    |          0 | res2iden2_bn_a[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_b    | (None, 11, 11,    |     36,928 | activation_26[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_b      | (None, 11, 11,    |        256 | res2iden2_conv_b… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_27       | (None, 11, 11,    |          0 | res2iden2_bn_b[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_c    | (None, 11, 11,    |     16,640 | activation_27[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_c      | (None, 11, 11,    |      1,024 | res2iden2_conv_c… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_8 (Add)         | (None, 11, 11,    |          0 | res2iden2_bn_c[0… |
|                     | 256)              |            | activation_25[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_28       | (None, 11, 11,    |          0 | add_8[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 11, 11,    |     32,896 | activation_28[0]… |
| (Conv2D)            | 128)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_8     | (None, 5, 5, 128) |          0 | res3convblock_co… |
| (MaxPooling2D)      |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_a  | (None, 5, 5, 128) |        512 | max_pooling2d_8[… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_29       | (None, 5, 5, 128) |          0 | res3convblock_bn… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 5, 5, 128) |    147,584 | activation_29[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_b  | (None, 5, 5, 128) |        512 | res3convblock_co… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_30       | (None, 5, 5, 128) |          0 | res3convblock_bn… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 11, 11,    |    131,584 | activation_28[0]… |
| (Conv2D)            | 512)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 5, 5, 512) |     66,048 | activation_30[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_9     | (None, 5, 5, 512) |          0 | res3convblock_co… |
| (MaxPooling2D)      |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_c  | (None, 5, 5, 512) |      2,048 | res3convblock_co… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_s… | (None, 5, 5, 512) |      2,048 | max_pooling2d_9[… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_9 (Add)         | (None, 5, 5, 512) |          0 | res3convblock_bn… |
|                     |                   |            | res3convblock_bn… |
+---------------------+-------------------+------------+-------------------+
| activation_31       | (None, 5, 5, 512) |          0 | add_9[0][0]       |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_a    | (None, 5, 5, 128) |     65,664 | activation_31[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_a      | (None, 5, 5, 128) |        512 | res3iden1_conv_a… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_32       | (None, 5, 5, 128) |          0 | res3iden1_bn_a[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_b    | (None, 5, 5, 128) |    147,584 | activation_32[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_b      | (None, 5, 5, 128) |        512 | res3iden1_conv_b… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_33       | (None, 5, 5, 128) |          0 | res3iden1_bn_b[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_c    | (None, 5, 5, 512) |     66,048 | activation_33[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_c      | (None, 5, 5, 512) |      2,048 | res3iden1_conv_c… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_10 (Add)        | (None, 5, 5, 512) |          0 | res3iden1_bn_c[0… |
|                     |                   |            | activation_31[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_34       | (None, 5, 5, 512) |          0 | add_10[0][0]      |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_a    | (None, 5, 5, 128) |     65,664 | activation_34[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_a      | (None, 5, 5, 128) |        512 | res3iden2_conv_a… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_35       | (None, 5, 5, 128) |          0 | res3iden2_bn_a[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_b    | (None, 5, 5, 128) |    147,584 | activation_35[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_b      | (None, 5, 5, 128) |        512 | res3iden2_conv_b… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_36       | (None, 5, 5, 128) |          0 | res3iden2_bn_b[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_c    | (None, 5, 5, 512) |     66,048 | activation_36[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_c      | (None, 5, 5, 512) |      2,048 | res3iden2_conv_c… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_11 (Add)        | (None, 5, 5, 512) |          0 | res3iden2_bn_c[0… |
|                     |                   |            | activation_34[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_37       | (None, 5, 5, 512) |          0 | add_11[0][0]      |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| avg_pool            | (None, 1, 1, 512) |          0 | activation_37[0]… |
| (AveragePooling2D)  |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| flatten_1 (Flatten) | (None, 512)       |          0 | avg_pool[0][0]    |
+---------------------+-------------------+------------+-------------------+
| dense (Dense)       | (None, 5)         |      2,565 | flatten_1[0][0]   |
└─────────────────────┴───────────────────┴────────────┴───────────────────┘
 Total params: 1,174,021 (4.48 MB)
 Trainable params: 1,165,445 (4.45 MB)
 Non-trainable params: 8,576 (33.50 KB)


  

Explanations of components¶

The ZeroPadding2D layer adds a border of zeros (3 pixels wide) around the input image. This prevents information loss at the edges during convolution.

Conv2D is the cake of our convolutional network. It applies a set of filters to the input image, sliding them across with a set stride. This is how features are extracted from the image.

The BatchNormalization layer normalizes the output of the convolution, making training more stable. We can say it is the smooth cream layer on our convolution cake.

The ReLU activation function introduces non-linearity to the model.

MaxPooling2D reduces the spatial dimensions of the feature maps by taking the maximum value in each window, downsampling the output. After the Res-blocks, AveragePooling2D is used; it works like MaxPooling except that it takes the average value within each window, again reducing the size of the feature maps. To give an impression of the impact of pooling: if we removed the MaxPooling2D layers from the Res-blocks, the final model would have 256 million parameters instead of 18 million.
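The effect of a pooling window can be sketched in a few lines of NumPy. This toy example (illustrative only, not from the notebook) max-pools a 4×4 feature map with a 2×2 window and stride 2:

```python
import numpy as np

def max_pool_2x2(x):
    # downsample an (H, W) feature map by taking the max of each 2x2 window
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = np.arange(1, 17).reshape(4, 4)
max_pool_2x2(a)  # [[ 6,  8], [14, 16]]
```

Each output value is the maximum of one non-overlapping 2×2 window, so the spatial size halves along both axes while the strongest activations survive.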

Flatten converts the multi-dimensional feature maps into a single, long vector, preparing the data for the fully connected layers.

Dense creates a fully connected layer where each neuron is connected to every neuron in the previous layer. These fully connected layers will process the features exrtacted by the convolutional layers.

Dropout layers are a regularisation technique that drops a set percentage of the neurons during training by setting them to zero. This makes the model less likely to overfit and decreases the interdependency between neurons, improving both the performance and the generalisability of the network.
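Inverted dropout, as used by modern frameworks, can be sketched in NumPy (a minimal illustration, not the Keras implementation):

```python
import numpy as np

def dropout(x, rate, rng):
    # zero out a random fraction `rate` of activations and scale the
    # survivors by 1/(1-rate) so the expected activation is unchanged
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
y = dropout(np.ones(1000), rate=0.5, rng=rng)
# roughly half the entries are 0.0, the rest are 2.0; the mean stays near 1
```

The scaling by 1/(1-rate) is why no extra correction is needed at inference time, where dropout is simply switched off.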

The final model has a very complex structure with 18 million trainable parameters, which allows it to learn to identify emotions as well as, or even better than, the average human. However, many parameters can also lead to problems, such as overfitting and slow or non-converging training. Optimising this many parameters is not a trivial task.

Compiling and training the model¶

I will use the Adam optimization method for training. Adam is a computationally efficient stochastic gradient method that combines gradient descent with momentum and the RMSProp algorithm.

As discussed earlier, momentum speeds up training by adding a fraction of the previous gradient to the current one. RMSProp, or Root Mean Square Propagation, is an adaptive learning algorithm that takes an exponential moving average of the squared gradients: it adapts the learning rate for each parameter by keeping track of an exponentially decaying average of past squared gradients.

The algorithm proceeds as follows:

1. Calculate the gradient $g_t$

$g_t = \frac{\partial L}{\partial w_t}$

2. Update the biased first moment estimate $m_t$

$m_t = \beta_1 m_{t-1} + (1-\beta_1)g_t$

This is similar to calculating the momentum as we keep track of the decaying average of past gradients.

3. Update the biased second moment estimate $v_t$

$v_t = \beta_2 v_{t-1} + (1-\beta_2)g_t^2$

This is similar to RMSP as we keep track of an exponentially decaying average of past squared gradients.

4. Bias correction for $m_t$ and $v_t$

Especially at the beginning of training, $m_t$ and $v_t$ are biased toward zero (because they are initialised at zero). Adam corrects for this as follows:

$\hat m_t = \frac{m_t}{1-\beta_1^t}$, $\hat v_t = \frac{v_t}{1-\beta_2^t}$

5. Parameter update

$w_t = w_{t-1} - \alpha_t\frac{\hat m_t}{\sqrt{\hat v_t}+\epsilon}$

where,

$g_t$ = gradient of the loss with respect to the parameters at iteration $t$

$\alpha_t$ = learning rate at iteration $t$

$\beta_1, \beta_2$ = decay rates for the moment estimates

$\epsilon$ = small constant to prevent division by zero
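The five steps above can be sketched directly in NumPy. This toy implementation (hypothetical, for illustration only) minimises $f(w) = w^2$:

```python
import numpy as np

def adam_step(w, g, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update of parameters w given gradient g (steps 1-5 above)."""
    m = beta1 * m + (1 - beta1) * g        # biased first moment estimate
    v = beta2 * v + (1 - beta2) * g**2     # biased second moment estimate
    m_hat = m / (1 - beta1**t)             # bias corrections
    v_hat = v / (1 - beta2**t)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# minimise f(w) = w^2, whose gradient is 2w
w, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 501):
    w, m, v = adam_step(w, 2 * w, m, v, t, alpha=0.05)
# w is now close to the minimum at 0
```

Note how the per-parameter scaling by $\sqrt{\hat v_t}$ keeps the effective step size roughly bounded by the learning rate, regardless of the raw gradient magnitude.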

The TensorFlow implementation of the Adam optimizer (`tf.keras.optimizers.Adam`) accepts several arguments:

  • learning_rate: a float, or a schedule that adapts the learning rate during training

  • beta_1: a float value or constant float tensor giving the exponential decay rate for the 1st moment estimates, i.e. the means of the gradients. Default = 0.9.

  • beta_2: a float value or constant float tensor giving the exponential decay rate for the 2nd moment estimates, i.e. the uncentered variance of the gradients. Default = 0.999.

  • amsgrad: True/False. Whether to apply the AMSGrad variant of the algorithm from the paper On the Convergence of Adam and Beyond. Default = False.

  • weight_decay: if set, applies weight decay to the parameters.

Other things to consider when optimising¶

The batch size determines how many training examples are processed before the model's internal parameters are updated. Smaller batch sizes can speed up the training per epoch because the model updates more frequently. However, this can lead to less stable convergence, i.e. the training loss may fluctuate more. A small batch size can be beneficial in case the model is overfitting (the training loss is significantly lower than the validation loss).

A larger batch size leads to slower training per epoch and requires more memory, but can yield more stable parameter updates. The model usually converges more smoothly, but might not generalise as well due to "sharp minima".
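To make the trade-off concrete, the number of parameter updates per epoch is ⌈N / batch_size⌉. With the 8,560 keypoint training samples used in this project:

```python
import math

n_train = 8560  # number of keypoint training samples in this project
for batch_size in (32, 64, 256):
    print(batch_size, math.ceil(n_train / batch_size))
# batch_size 32 gives 268 updates per epoch, 64 gives 134, 256 gives 34
```

Halving the batch size doubles the number of updates per epoch, which is exactly the "updates more frequently" effect described above.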

Another way to tune the optimization is to use learning rate schedulers. Why? As training progresses, the model gets closer to a good solution. Smaller learning rates allow for finer adjustments to the model's weights, helping it converge to a better minimum without overshooting (see the gradient descent examples in the beginning). I have implemented a learning rate scheduler that reduces the learning rate if the validation loss does not improve for 5 epochs.

After training, the model is saved in a .keras file. The .keras file is a zip archive that contains:

  • The architecture
  • The weights
  • The optimizer's state
# @title Compiling and training with 3 epochs
run_example = False
if run_example:
  adam = tf.keras.optimizers.Adam(learning_rate = 0.0001, beta_1 = 0.9, \
                                  beta_2 = 0.999, amsgrad = False)
  model_3_facialKeyPoints = Model(inputs = X_input, outputs = X)
  model_3_facialKeyPoints.compile(loss = "mean_squared_error", optimizer = adam, \
                                  metrics = ['accuracy'])

  # Save the model with the lowest validation loss
  checkpoint  = ModelCheckpoint(filepath = "Models/FacialKeyPoints_model_3.keras", \
                                verbose = 1, save_best_only = True)

  history3 = model_3_facialKeyPoints.fit(X_train_kp, y_train_kp, batch_size = 32, \
                    epochs = 3, validation_split = 0.05, callbacks=[checkpoint])
# @title Compiling and training with batch_size = 64, epochs = 100, and decay-on-plateau of the learning rate

if retrain_model:

  initial_learning_rate=0.0008

  # compile model
  adam = tf.keras.optimizers.Adam(learning_rate = initial_learning_rate, beta_1 = 0.9, \
                                  beta_2 = 0.999, amsgrad = False)
  model_1_facialKeyPoints = Model(inputs = X_input, outputs = X)
  model_1_facialKeyPoints.compile(loss = "mean_squared_error", optimizer = adam, \
                                  metrics = ['accuracy'])
  # Callbacks: reduce lr on plateau
  reduce_lr = ReduceLROnPlateau(
      monitor='val_loss',
      factor=0.65,
      patience=5,
      min_lr=1e-8,
      verbose=1
  )

  early = EarlyStopping(
    monitor='val_loss',
    patience=12,
    restore_best_weights=True,
    verbose=1,
    mode = 'min'
  )


  # Callbacks: save best model
  checkpoint = ModelCheckpoint(
      filepath="Models/FacialKeyPoints_model_1.keras",
      verbose=1,
      save_best_only=True
  )

  # Callbacks: logs epoch results to CSV
  csv_logger = CSVLogger(
      'Models/training_history_model_1.csv',
      append=True,         # keep adding if file exists
      separator=','        # comma-separated
  )
  # fit with CSVLogger included
  history = model_1_facialKeyPoints.fit(
      X_train_kp, y_train_kp,
      batch_size=64,
      epochs=100,
      validation_split=0.05,
      callbacks=[checkpoint, reduce_lr, csv_logger, early]
  )
print(X_train_kp.shape)   # e.g. (N, 96, 96, 1)
print(y_train_kp.shape)   # should print (N, 30)
(8560, 96, 96, 1)
(8560, 30)

Assessing the trained key facial points detection model performance¶

# Load the saved final model and recompile it
adam = tf.keras.optimizers.Adam(learning_rate = 0.0001, beta_1 = 0.9, \
                                beta_2 = 0.999, amsgrad = False)
model_1_facialKeyPoints = tf.keras.models.load_model("Models/FacialKeyPoints_model_1.keras")
model_1_facialKeyPoints.compile(loss = "mean_squared_error", optimizer = adam, \
                                metrics = ['accuracy'])
# Evaluate the model
# The model from the course materials reaches loss: 8.3705, accuracy: 0.85280377 on the X_test, y_test set.

result = model_1_facialKeyPoints.evaluate(X_test_kp, y_test_kp)
67/67 ━━━━━━━━━━━━━━━━━━━━ 14s 168ms/step - accuracy: 0.8088 - loss: 36.7290
67/67 ━━━━━━━━━━━━━━━━━━━━ 13s 188ms/step
predicted_kp = model_1_facialKeyPoints.predict(X_test_kp)
predicted_kp = pd.DataFrame(predicted_kp, columns=columns)
predicted_kp
left_eye_center_x left_eye_center_y right_eye_center_x right_eye_center_y left_eye_inner_corner_x left_eye_inner_corner_y left_eye_outer_corner_x left_eye_outer_corner_y right_eye_inner_corner_x right_eye_inner_corner_y ... nose_tip_x nose_tip_y mouth_left_corner_x mouth_left_corner_y mouth_right_corner_x mouth_right_corner_y mouth_center_top_lip_x mouth_center_top_lip_y mouth_center_bottom_lip_x mouth_center_bottom_lip_y
0 27.774408 41.608398 69.701347 39.645927 36.824760 42.768158 18.683451 43.374584 61.305145 41.332291 ... 51.949944 64.388763 33.121056 88.780327 69.142967 87.274277 51.623299 84.660240 52.020306 94.269226
1 63.867874 59.837025 29.558842 56.851753 57.666470 58.394218 70.214401 59.508854 35.728714 56.771961 ... 48.389038 41.098076 64.282990 24.717131 35.385838 22.573711 49.672440 26.816153 50.343761 18.170223
2 66.701538 38.380436 30.169611 36.045486 60.238506 39.013817 73.072281 39.661934 36.369148 37.416374 ... 47.273544 59.190262 60.052547 82.078560 32.966862 80.368378 46.807594 75.696205 46.202118 88.636070
3 29.851620 40.825123 68.809959 39.051147 37.333893 41.556324 22.312654 42.364140 61.609417 40.434784 ... 49.816231 61.445831 34.389439 85.753067 66.853050 84.536171 50.709244 78.521538 50.523693 94.321213
4 29.966978 34.698605 66.767426 40.271515 37.919559 37.025856 21.968967 34.297169 59.200417 40.350548 ... 47.481728 60.271687 26.436142 72.177620 59.407753 76.562050 43.661579 75.111107 42.907543 80.085808
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2135 65.749184 62.595676 51.449287 25.544937 62.024063 55.722618 67.485916 70.113510 53.356335 32.962105 ... 39.002796 52.028568 25.570807 73.766769 13.956963 43.304741 23.576105 56.880741 12.931511 61.951336
2136 69.661598 39.893448 32.801315 33.998451 63.678589 39.639263 75.621635 41.654362 39.119526 35.907803 ... 50.316917 56.251945 60.237221 76.699028 31.409067 72.033859 47.122726 68.871414 45.235683 83.522102
2137 67.449257 36.665802 29.131422 36.421165 60.840542 36.884846 74.299210 37.797375 35.730961 36.823273 ... 46.935883 49.284607 63.658432 73.521011 32.097244 73.253479 47.337315 65.122261 46.956753 81.887756
2138 69.481331 44.736366 36.860580 30.242250 64.033051 42.824123 74.173164 47.903767 41.879425 33.272594 ... 46.418869 50.943565 52.048347 73.044212 26.092253 61.490208 40.972645 61.946621 35.929371 74.580406
2139 30.817675 35.639484 64.787453 37.358612 37.058781 37.048462 24.390446 36.065872 58.306843 38.046783 ... 45.163116 57.671455 31.936731 73.423302 60.704617 74.940430 45.779903 71.358398 45.488403 81.041809

2140 rows × 30 columns

# @title Printing out samples of predictions
fig, axes = plt.subplots(4,4, figsize=(10,10))
axes = axes.ravel()

for i in range(16):
  axes[i].imshow(X_test_kp[i].reshape(96,96), cmap='gray')
  axes[i].axis('off')
  for j in range(1,31,2):
      axes[i].plot(predicted_kp.iloc[i,j-1],predicted_kp.iloc[i,j], marker='.', color='r')

#plt.tight_layout()
plt.show()

Part 2. Facial Expression detection¶

In this second part of the project, I train the second model, which will classify emotions. The data contains images belonging to 5 categories:

  • 0 = Angry
  • 1 = Disgust
  • 2 = Sad
  • 3 = Happy
  • 4 = Surprise

The images in the data set are of size 48px × 48px. They therefore need to be resized to 96px × 96px so that the expression detection model and the key facial point detection model can be run together.

Below is an example of an original image, the result of resizing, and the final image after interpolation.
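The resize itself is typically done with an interpolating routine such as OpenCV's cv2.resize. As a dependency-free sketch, a plain nearest-neighbour 2× upscale from 48×48 to 96×96 looks like this (illustrative only, not the notebook's exact code):

```python
import numpy as np

def upscale_2x_nearest(img):
    # 48x48 -> 96x96 by repeating every pixel twice along each axis
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

img48 = np.arange(48 * 48, dtype=np.float32).reshape(48, 48)
img96 = upscale_2x_nearest(img48)
print(img96.shape)  # (96, 96)
```

Nearest-neighbour duplication produces blocky results; a smoother interpolation (bilinear or bicubic) gives the softer final image shown above.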


Visualising the images in the dataset with the emotions¶

expression_df.head()
emotion pixels
0 0 [[69.316925, 73.03865, 79.13719, 84.17186, 85....
1 0 [[151.09435, 150.91393, 150.65791, 148.96367, ...
2 2 [[23.061905, 25.50914, 29.47847, 33.99843, 36....
3 2 [[20.083221, 19.079437, 17.398712, 17.158691, ...
4 3 [[76.26172, 76.54747, 77.001785, 77.7672, 78.4...

Below are the counts of each emotion category. Our data is highly unbalanced, with very few images portraying disgust and many images in the happy category.
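The per-category counts can be read straight off the dataframe. A toy sketch, using a hypothetical miniature of expression_df with the emotion codes above:

```python
import pandas as pd

# hypothetical miniature of expression_df (emotion codes as defined above)
mini_df = pd.DataFrame({"emotion": [3, 3, 3, 3, 0, 2, 2, 4, 0, 2]})
counts = mini_df["emotion"].value_counts().sort_index()
print(counts)  # happy (3) dominates; disgust (1) is absent entirely
```

In the real data the same pattern appears on a larger scale, which is worth keeping in mind when interpreting per-class accuracy later.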


Data preparation and image augmentation¶

X shape (24568, 96, 96, 1)
y shape (24568, 5)
X train shape (22111, 96, 96, 1)
y train shape (22111, 5)
X val shape (1228, 96, 96, 1)
y val shape (1228, 5)
X test shape (1229, 96, 96, 1)
y test shape (1229, 5)

Data preprocessing¶

In the data preprocessing I will again normalize the data and perform image augmentation, as was done in Part 1 of the project.

First, I normalize the data to contain values between 0 and 1. Then, I use the following image augmentation techniques:

  1. rotating up to 15 degrees
  2. shifting the image horizontally up to 0.1 × image width
  3. shifting the image vertically up to 0.1 × image height
  4. shearing the image up to 0.1
  5. zooming the image up to 10%
  6. horizontally flipping the image
  7. vertically flipping the image
  8. adjusting the brightness

The areas outside the image boundaries are filled by replicating the nearest pixels.
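In Keras these augmentations are usually configured through ImageDataGenerator. As a framework-free illustration, the normalization step and one of the augmentations (horizontal flipping, item 6) can be sketched in NumPy with hypothetical helper names:

```python
import numpy as np

def normalize(batch):
    # scale 8-bit pixel values into [0, 1]
    return batch.astype(np.float32) / 255.0

def random_horizontal_flip(batch, rng):
    # flip each image left-right with probability 0.5
    flips = rng.random(len(batch)) < 0.5
    out = batch.copy()
    out[flips] = out[flips][:, :, ::-1]  # reverse the width axis
    return out

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 96, 96, 1), dtype=np.uint8)
batch = random_horizontal_flip(normalize(batch), rng)
```

Applying such random transforms on the fly means the model rarely sees the exact same image twice, which acts as a further regulariser on top of dropout.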

Build and train Deep Learning model for facial expression classification¶

The model I will build has the following architecture:

Emotion Detection model: INPUT → Zero padding → Conv2D → BatchNorm, ReLU → MaxPool2D → Res-block → Res-block → AveragePooling2D → Flatten → Dense, ReLU, Dropout → OUTPUT
# @title Emotion recognition model

input_shape = (96,96,1)

# Input tensor shape
X_input = Input(input_shape)

# Zero-padding
X = ZeroPadding2D((3,3))(X_input)

# Stage 1
X = Conv2D(64, (7,7), strides = (2,2), name = 'conv1', kernel_initializer=glorot_uniform(seed=0))(X)
X = BatchNormalization(axis = 3, name = 'bn1')(X)
X = Activation('relu')(X)
X = MaxPooling2D((3,3), strides = (2,2))(X)

# Stage 2
X = res_block(X, filter = [64,64,256], stage = 'res2')

# Stage 3
X = res_block(X, filter = [128,128,512], stage = 'res3')

# Stage 4 (optional)
#X = res_block(X, filter= [256,256,1024], stage = 'res4')

# Average pooling
X = AveragePooling2D((4,4), name = 'avg_pool')(X)

# Final layer
X = Flatten()(X)
X  = Dense(5, activation = 'softmax', name = 'dense', kernel_initializer=glorot_uniform(seed=0))(X)

Emotion_det_model_2 = Model(inputs = X_input, outputs = X, name = 'Resnet18')
Model: "Resnet18"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)        ┃ Output Shape      ┃    Param # ┃ Connected to      ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
| input_layer_1       | (None, 96, 96, 1) |          0 | -                 |
| (InputLayer)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| zero_padding2d_1    | (None, 102, 102,  |          0 | input_layer_1[0]… |
| (ZeroPadding2D)     | 1)                |            |                   |
+---------------------+-------------------+------------+-------------------+
| conv1 (Conv2D)      | (None, 48, 48,    |      3,200 | zero_padding2d_1… |
|                     | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| bn1                 | (None, 48, 48,    |        256 | conv1[0][0]       |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_19       | (None, 48, 48,    |          0 | bn1[0][0]         |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_5     | (None, 23, 23,    |          0 | activation_19[0]… |
| (MaxPooling2D)      | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 23, 23,    |      4,160 | max_pooling2d_5[… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_6     | (None, 11, 11,    |          0 | res2convblock_co… |
| (MaxPooling2D)      | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_a  | (None, 11, 11,    |        256 | max_pooling2d_6[… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_20       | (None, 11, 11,    |          0 | res2convblock_bn… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 11, 11,    |     36,928 | activation_20[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_b  | (None, 11, 11,    |        256 | res2convblock_co… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_21       | (None, 11, 11,    |          0 | res2convblock_bn… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 23, 23,    |     16,640 | max_pooling2d_5[… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_conv… | (None, 11, 11,    |     16,640 | activation_21[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_7     | (None, 11, 11,    |          0 | res2convblock_co… |
| (MaxPooling2D)      | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_c  | (None, 11, 11,    |      1,024 | res2convblock_co… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2convblock_bn_s… | (None, 11, 11,    |      1,024 | max_pooling2d_7[… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_6 (Add)         | (None, 11, 11,    |          0 | res2convblock_bn… |
|                     | 256)              |            | res2convblock_bn… |
+---------------------+-------------------+------------+-------------------+
| activation_22       | (None, 11, 11,    |          0 | add_6[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_a    | (None, 11, 11,    |     16,448 | activation_22[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_a      | (None, 11, 11,    |        256 | res2iden1_conv_a… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_23       | (None, 11, 11,    |          0 | res2iden1_bn_a[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_b    | (None, 11, 11,    |     36,928 | activation_23[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_b      | (None, 11, 11,    |        256 | res2iden1_conv_b… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_24       | (None, 11, 11,    |          0 | res2iden1_bn_b[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_conv_c    | (None, 11, 11,    |     16,640 | activation_24[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden1_bn_c      | (None, 11, 11,    |      1,024 | res2iden1_conv_c… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_7 (Add)         | (None, 11, 11,    |          0 | res2iden1_bn_c[0… |
|                     | 256)              |            | activation_22[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_25       | (None, 11, 11,    |          0 | add_7[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_a    | (None, 11, 11,    |     16,448 | activation_25[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_a      | (None, 11, 11,    |        256 | res2iden2_conv_a… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_26       | (None, 11, 11,    |          0 | res2iden2_bn_a[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_b    | (None, 11, 11,    |     36,928 | activation_26[0]… |
| (Conv2D)            | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_b      | (None, 11, 11,    |        256 | res2iden2_conv_b… |
| (BatchNormalizatio… | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_27       | (None, 11, 11,    |          0 | res2iden2_bn_b[0… |
| (Activation)        | 64)               |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_conv_c    | (None, 11, 11,    |     16,640 | activation_27[0]… |
| (Conv2D)            | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res2iden2_bn_c      | (None, 11, 11,    |      1,024 | res2iden2_conv_c… |
| (BatchNormalizatio… | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_8 (Add)         | (None, 11, 11,    |          0 | res2iden2_bn_c[0… |
|                     | 256)              |            | activation_25[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_28       | (None, 11, 11,    |          0 | add_8[0][0]       |
| (Activation)        | 256)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 11, 11,    |     32,896 | activation_28[0]… |
| (Conv2D)            | 128)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_8     | (None, 5, 5, 128) |          0 | res3convblock_co… |
| (MaxPooling2D)      |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_a  | (None, 5, 5, 128) |        512 | max_pooling2d_8[… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_29       | (None, 5, 5, 128) |          0 | res3convblock_bn… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 5, 5, 128) |    147,584 | activation_29[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_b  | (None, 5, 5, 128) |        512 | res3convblock_co… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_30       | (None, 5, 5, 128) |          0 | res3convblock_bn… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 11, 11,    |    131,584 | activation_28[0]… |
| (Conv2D)            | 512)              |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_conv… | (None, 5, 5, 512) |     66,048 | activation_30[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| max_pooling2d_9     | (None, 5, 5, 512) |          0 | res3convblock_co… |
| (MaxPooling2D)      |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_c  | (None, 5, 5, 512) |      2,048 | res3convblock_co… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3convblock_bn_s… | (None, 5, 5, 512) |      2,048 | max_pooling2d_9[… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_9 (Add)         | (None, 5, 5, 512) |          0 | res3convblock_bn… |
|                     |                   |            | res3convblock_bn… |
+---------------------+-------------------+------------+-------------------+
| activation_31       | (None, 5, 5, 512) |          0 | add_9[0][0]       |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_a    | (None, 5, 5, 128) |     65,664 | activation_31[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_a      | (None, 5, 5, 128) |        512 | res3iden1_conv_a… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_32       | (None, 5, 5, 128) |          0 | res3iden1_bn_a[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_b    | (None, 5, 5, 128) |    147,584 | activation_32[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_b      | (None, 5, 5, 128) |        512 | res3iden1_conv_b… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_33       | (None, 5, 5, 128) |          0 | res3iden1_bn_b[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_conv_c    | (None, 5, 5, 512) |     66,048 | activation_33[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden1_bn_c      | (None, 5, 5, 512) |      2,048 | res3iden1_conv_c… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_10 (Add)        | (None, 5, 5, 512) |          0 | res3iden1_bn_c[0… |
|                     |                   |            | activation_31[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_34       | (None, 5, 5, 512) |          0 | add_10[0][0]      |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_a    | (None, 5, 5, 128) |     65,664 | activation_34[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_a      | (None, 5, 5, 128) |        512 | res3iden2_conv_a… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_35       | (None, 5, 5, 128) |          0 | res3iden2_bn_a[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_b    | (None, 5, 5, 128) |    147,584 | activation_35[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_b      | (None, 5, 5, 128) |        512 | res3iden2_conv_b… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| activation_36       | (None, 5, 5, 128) |          0 | res3iden2_bn_b[0… |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_conv_c    | (None, 5, 5, 512) |     66,048 | activation_36[0]… |
| (Conv2D)            |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| res3iden2_bn_c      | (None, 5, 5, 512) |      2,048 | res3iden2_conv_c… |
| (BatchNormalizatio… |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| add_11 (Add)        | (None, 5, 5, 512) |          0 | res3iden2_bn_c[0… |
|                     |                   |            | activation_34[0]… |
+---------------------+-------------------+------------+-------------------+
| activation_37       | (None, 5, 5, 512) |          0 | add_11[0][0]      |
| (Activation)        |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| avg_pool            | (None, 1, 1, 512) |          0 | activation_37[0]… |
| (AveragePooling2D)  |                   |            |                   |
+---------------------+-------------------+------------+-------------------+
| flatten_1 (Flatten) | (None, 512)       |          0 | avg_pool[0][0]    |
+---------------------+-------------------+------------+-------------------+
| dense (Dense)       | (None, 5)         |      2,565 | flatten_1[0][0]   |
└---------------------┴-------------------┴------------┴-------------------┘
 Total params: 1,174,021 (4.48 MB)
 Trainable params: 1,165,445 (4.45 MB)
 Non-trainable params: 8,576 (33.50 KB)
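The `add_*` layers in the summary above are the residual skip connections: each block's convolutional branch is added element-wise to the block's input before the final activation. A toy numpy sketch of the idea (not the actual Keras layers):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def identity_block(x, branch):
    """Toy residual identity block: the branch preserves the input shape,
    so branch(x) can be added to the shortcut x before the activation."""
    return relu(branch(x) + x)

x = np.array([1.0, -2.0, 3.0])
out = identity_block(x, lambda t: 0.5 * t)  # branch halves the input
print(out)  # relu(1.5 * x) -> [1.5, 0., 4.5]
```

Because the shortcut passes gradients through unchanged, stacking such blocks keeps deep networks trainable.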


  
print(f"Training samples: {len(X_train_ed)}")
print(f"Batch size: {64}")
steps_per_epoch = np.ceil(len(X_train_ed) / 64).astype(int)
print(f"Steps per epoch: {steps_per_epoch}")
Training samples: 22111
Batch size: 64
Steps per epoch: 346
dict_keys(['epoch', 'accuracy', 'lear_rate', 'loss', 'val_accuracy', 'val_loss'])

Evaluate model¶

Confusion matrix, accuracy, precision, and recall

No description has been provided for this image
39/39 ━━━━━━━━━━━━━━━━━━━━ 8s 156ms/step - accuracy: 0.7534 - loss: 0.5960
39/39 ━━━━━━━━━━━━━━━━━━━━ 7s 147ms/step
No description has been provided for this image
No description has been provided for this image
print(classification_report(true_classes, predicted_classes))
              precision    recall  f1-score   support

           0       0.68      0.65      0.66       245
           1       0.46      0.27      0.34        22
           2       0.62      0.72      0.67       319
           3       0.86      0.84      0.85       458
           4       0.87      0.77      0.81       185

    accuracy                           0.75      1229
   macro avg       0.70      0.65      0.67      1229
weighted avg       0.76      0.75      0.75      1229

The table above shows that the classes with the least data (see the support column) have the weakest performance. Precision (the fraction of samples predicted as class x that actually are x) and recall (the fraction of true x samples that are correctly labeled as x) are highest for class 3, which also has the most samples. The F1-score is the harmonic mean of precision and recall and is calculated as

$F_1 = 2 \cdot \frac{\text{precision} \ \times \ \text{recall}}{\text{precision} \ +\ \text{recall}}$
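As a concrete check, these per-class metrics can be computed directly from a confusion matrix; a minimal numpy sketch with dummy two-class counts (not the real results above):

```python
import numpy as np

def precision_recall_f1(cm):
    """Per-class precision, recall, and F1 from a confusion matrix
    whose rows are true classes and columns are predicted classes."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)  # TP / (TP + FP)
    recall = tp / cm.sum(axis=1)     # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

cm = np.array([[8, 2],
               [1, 9]])
p, r, f1 = precision_recall_f1(cm)
print(r)  # [0.8 0.9]
```

This reproduces what `classification_report` computes per class, with support being each row sum of the confusion matrix.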

Part 3. Combining the key point detection and facial expression recognition models¶


def predict(X_test):
  # Predict the keypoints
  df_predict = model_1_facialKeyPoints.predict(X_test)

  # Predict the emotion
  df_emotion = np.argmax(model_emotion.predict(X_test), axis=-1)

  # Reshape the emotion array from (n,) to (n, 1)
  df_emotion = np.expand_dims(df_emotion, axis=1)

  # Convert the keypoint predictions into a dataframe
  df_predict = pd.DataFrame(df_predict, columns=columns)

  # Add the emotion column to the predicted dataframe
  df_predict['emotion'] = df_emotion

  return df_predict
df_predict = predict(X_test_ed)
39/39 ━━━━━━━━━━━━━━━━━━━━ 6s 158ms/step
39/39 ━━━━━━━━━━━━━━━━━━━━ 6s 150ms/step
df_predict.head()
left_eye_center_x left_eye_center_y right_eye_center_x right_eye_center_y left_eye_inner_corner_x left_eye_inner_corner_y left_eye_outer_corner_x left_eye_outer_corner_y right_eye_inner_corner_x right_eye_inner_corner_y ... nose_tip_y mouth_left_corner_x mouth_left_corner_y mouth_right_corner_x mouth_right_corner_y mouth_center_top_lip_x mouth_center_top_lip_y mouth_center_bottom_lip_x mouth_center_bottom_lip_y emotion
0 66.610443 40.496799 36.849850 25.179127 59.685791 38.832352 72.507202 43.841587 42.234673 28.970411 ... 51.316212 49.191902 70.809021 23.674339 58.018810 36.823456 64.206169 32.734852 67.690857 3
1 64.653381 38.042400 29.781794 34.580994 57.754303 38.556492 71.875885 39.505543 36.547447 36.649067 ... 59.056896 58.322350 77.191154 28.911465 74.403412 43.987129 74.182167 43.264111 80.884216 0
2 60.960003 37.274998 33.889053 33.123466 55.079700 37.744537 66.276901 38.548401 38.508621 35.537212 ... 59.973900 51.810158 79.506180 30.799675 76.300377 40.469517 78.190208 40.058804 79.434502 2
3 56.973576 37.822292 25.538336 38.267265 50.197941 38.189537 64.570473 37.789585 32.019382 38.396935 ... 44.175758 59.712654 48.962044 30.902285 48.933670 44.473202 48.845020 45.226284 49.808178 3
4 62.833561 40.498875 31.723818 37.697029 56.710606 41.063759 69.142136 41.633846 37.679863 39.783356 ... 58.978386 57.483551 75.107430 30.617804 73.028336 44.611393 72.902473 44.198174 77.997261 0

5 rows × 31 columns

Plotting test images with predictions from the combined models.

No description has been provided for this image
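A minimal sketch of how such a plot can be produced from `df_predict`, assuming 96x96 grayscale test images; the emotion label mapping below is a hypothetical placeholder, not taken from this notebook, and should be checked against the dataset's encoding:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend, e.g. for scripts
import matplotlib.pyplot as plt

# Hypothetical mapping -- verify against the dataset's class encoding.
EMOTIONS = {0: "anger", 1: "disgust", 2: "sadness", 3: "happiness", 4: "surprise"}

def keypoint_coords(row):
    """Split one prediction row into (x, y) arrays; the keypoint
    columns alternate _x/_y, with 'emotion' as the last column."""
    coords = row.drop("emotion").to_numpy(dtype=float)
    return coords[0::2], coords[1::2]

def plot_prediction(image, row, ax):
    """Draw one test image with its predicted keypoints and emotion."""
    ax.imshow(image.squeeze(), cmap="gray")
    xs, ys = keypoint_coords(row)
    ax.plot(xs, ys, "rx")
    ax.set_title(EMOTIONS[int(row["emotion"])])
    ax.axis("off")

# Dummy usage with random data in place of X_test_ed / df_predict:
row = pd.Series({"p1_x": 10.0, "p1_y": 20.0, "p2_x": 30.0, "p2_y": 40.0, "emotion": 3})
fig, ax = plt.subplots()
plot_prediction(np.random.rand(96, 96, 1), row, ax)
```

In the notebook itself, the loop would iterate over `X_test_ed` and the rows of `df_predict.head()`.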

Part 4. Saving the trained model for deployment and making requests¶

With TensorFlow Serving the model can easily be deployed to serve predictions. For this purpose I save the model in a format supported by TensorFlow Serving. The model has a version number and is saved in a structured directory. Once saved, that version of the model becomes "servable" and can be loaded by the server.

For the run, the following parameters have to be defined:

  • rest_api_port = the port used for REST requests
  • model_name = the name used in the URL of the REST request
  • model_base_path = the path to the directory where the model is saved

REST (REpresentational State Transfer) is an architectural style for web APIs in which standard HTTP methods and URLs carry the semantic meaning of each operation.

In order to predict using TensorFlow Serving, the inputs need to be passed as a JSON object. We then use the Python requests library to make a POST request to the deployed model, passing in the JSON object containing the inference requests (image data). Finally, the prediction is read from the response, and the predicted class is found with the argmax function.
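The request/response steps can be sketched with two small helpers. The URL matches the emotion server started on port 4000; the assumption (not confirmed here) is that the model returns one score vector per instance under the `"predictions"` key, as TF Serving's REST API does for a single-output model:

```python
import json
import numpy as np

def build_request(images):
    """Serialize a batch of images into the JSON body TF Serving expects."""
    return json.dumps({"signature_name": "serving_default",
                       "instances": images.tolist()})

def parse_response(body):
    """Extract predicted class indices from a TF Serving JSON response."""
    preds = np.array(json.loads(body)["predictions"])
    return np.argmax(preds, axis=-1)

# A real call would POST to the running server, e.g.:
# import requests
# resp = requests.post(
#     "http://localhost:4000/v1/models/emotion_detection_model:predict",
#     data=build_request(X_test_ed[0:3]),
#     headers={"content-type": "application/json"})
# classes = parse_response(resp.text)

# Demonstration with dummy data instead of a live server:
batch = np.zeros((2, 96, 96, 1))
print(len(json.loads(build_request(batch))["instances"]))  # 2
fake = json.dumps({"predictions": [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]})
print(parse_response(fake))  # [1 0]
```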

# @title Save the trained model in Saved model format

def deploy(directory, model):
  MODEL_DIR = directory
  version = 1
  # Join the version number with the model directory
  export_path = os.path.join(MODEL_DIR, str(version))
  print('export_path = {}\n'.format(export_path))

  # Save the model

  if os.path.isdir(export_path):
    print('\nAlready saved a model, cleaning up\n')
    !rm -r "{export_path}"

  tf.saved_model.save(model, export_path)

  os.environ["MODEL_DIR"] = MODEL_DIR
deploy("/content/drive/MyDrive/Colab Notebooks/Emotion-AI/Models/Model1", model_1_facialKeyPoints)
export_path = /content/drive/MyDrive/Colab Notebooks/Emotion-AI/Models/Model1/1


Already saved a model, cleaning up

%%bash --bg
nohup tensorflow_model_server \
  --rest_api_port=4500 \
  --model_name=FacialKeyPoints_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1
!tail server.log
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1748268374.120064  153283 loader_harness.cc:71] Approving load for servable version {name: FacialKeyPoints_model version: 1}
I0000 00:00:1748268374.120568  153283 loader_harness.cc:79] Loading servable version {name: FacialKeyPoints_model version: 1}
I0000 00:00:1748268374.233872  153283 mlir_graph_optimization_pass.cc:425] MLIR V1 optimization pass is not enabled
I0000 00:00:1748268375.042949  153283 loader_harness.cc:105] Successfully loaded servable version {name: FacialKeyPoints_model version: 1}
[warn] getaddrinfo: address family for nodename not supported
[evhttp_server.cc : 261] NET_LOG: Entering the event loop ...
!curl -s localhost:4500/v1/models/FacialKeyPoints_model
deploy("/content/drive/MyDrive/Colab Notebooks/Emotion-AI/Models/Model1", model_emotion)
export_path = /content/drive/MyDrive/Colab Notebooks/Emotion-AI/Models/Model1/1


Already saved a model, cleaning up

%%bash --bg
nohup tensorflow_model_server \
  --rest_api_port=4000 \
  --model_name=emotion_detection_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1
!tail server.log
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1748268529.568974  157016 loader_harness.cc:71] Approving load for servable version {name: emotion_detection_model version: 1}
I0000 00:00:1748268529.569295  157016 loader_harness.cc:79] Loading servable version {name: emotion_detection_model version: 1}
I0000 00:00:1748268529.730752  157016 mlir_graph_optimization_pass.cc:425] MLIR V1 optimization pass is not enabled
I0000 00:00:1748268530.692729  157016 loader_harness.cc:105] Successfully loaded servable version {name: emotion_detection_model version: 1}
[warn] getaddrinfo: address family for nodename not supported
[evhttp_server.cc : 261] NET_LOG: Entering the event loop ...
# @title Making requests to model in tensorflow serving

# Creating a JSON object and making 3 inference requests
data = json.dumps({"signature_name": "serving_default", "instances": X_test_ed[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
Data: {"signature_name": "serving_default", "instances": ... 97], [0.1099964827299118], [0.10756141692399979]]]]}
make_new_branch = False
if make_new_branch:
  # 1. Create a brand-new “orphan” branch (no history)
  !git checkout --orphan clean-start

  # 2. Stage everything in your current working directory
  !git add -A

  # 3. Commit it as your one “fresh start” commit
  !git commit -m "Fresh start: keep only current work"

  # 4. Force-push this new branch to overwrite remote main
  !git push https://$token@github.com/KaisuH/Emotion-AI.git clean-start:main --force

  # 5. (Optional) Switch back to ‘main’ locally and delete the temp branch
  !git checkout main
  !git branch -D clean-start